ABSTRACT

In this paper, we derive a necessary and sufficient condition on the parameters of the Hypergeometric distribution for weak convergence to a Normal limit. We establish a Berry-Esseen theorem for the Hypergeometric distribution solely under this necessary and sufficient condition. We further derive a nonuniform Berry-Esseen bound where the tails of the difference between the Hypergeometric and the Normal distribution functions are shown to decay at a sub-Gaussian rate.
1 Introduction



Consider a dichotomous finite population of size $N$ having $M$ individuals of type A and $N - M$ individuals of type B. Suppose a sample of size $n$ is drawn at random, without replacement, from this population. Let $X$ denote the number of 'type A'-individuals in the sample. Then, $X$ is said to have the Hypergeometric distribution with parameters $n, M, N$, written as $X \sim \mathrm{Hyp}(n; M, N)$. The probability mass function (p.m.f.) of $X$ is given by

\[
P(X = x) \equiv P(x; n, M, N) =
\begin{cases}
\binom{M}{x}\binom{N-M}{n-x} \Big/ \binom{N}{n} & \text{if } x = 0, 1, \ldots, n \\
0 & \text{otherwise,}
\end{cases}
\tag{1.1}
\]

where, for any two integers $r \ge 1$ and $s$,

\[
\binom{r}{s} =
\begin{cases}
\dfrac{r!}{s!\,(r-s)!} & \text{if } 0 \le s \le r \\
0 & \text{otherwise,}
\end{cases}
\tag{1.2}
\]

with $0! = 1$ and $r! = 1 \cdot 2 \cdots r$. Let $f = n/N$ denote the sampling fraction and let $p = M/N$ denote the proportion of the 'type A'-objects in the population. The Hypergeometric distribution plays an important role in many areas of Statistics, including sample surveys (e.g., finite population inference), statistical quality control (acceptance sampling plans), etc. Normal approximations to the Hypergeometric probabilities $P(\cdot\,; n, M, N)$ of (1.1) are classical in the cases where the sampling fraction $f$ and the proportion $p$ are bounded away from 0 and 1; see for example Feller (1971). However, the extreme cases where $f$ or $p$ take values near the boundary values 0 and 1 are very important in sample surveys and quality control applications. In this paper, we investigate the validity and the rate of Normal approximation to the Hypergeometric distribution allowing the parameters $f$ and $p$ to tend to any points in the interval $[0, 1]$, including the boundary points. The main results of the paper give a necessary and sufficient condition on the parameters $f$ and $p$ for a valid Normal approximation. It is shown that a Normal limit for a properly centered and scaled version of $X$ holds if and only if

\[ N p (1 - p) f (1 - f) \to \infty. \tag{1.3} \]

As a consequence of this, we conclude that for the Normal distribution function to approximate the distribution function of $X$, all four quantities, namely, (i) the number $M$ ($= Np$) of 'type A'-objects, (ii) the number of 'type B'-objects, $N - M$, (iii) the sample size $n$, as well as (iv) the number of unselected objects $N - n$ in the population, must tend to infinity.
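As a minimal sketch (not part of the original paper), the p.m.f. (1.1) and the quantity in condition (1.3) can be computed directly. The helper names `hyp_pmf` and `normal_limit_index` and the parameter values are hypothetical, chosen only for illustration.

```python
from math import comb

def hyp_pmf(x, n, M, N):
    """P(X = x) for X ~ Hyp(n; M, N), following (1.1)."""
    if x < 0 or x > n:
        return 0.0
    # math.comb(r, s) returns 0 when s > r, matching the convention in (1.2)
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def normal_limit_index(n, M, N):
    """The quantity N p (1 - p) f (1 - f) that must diverge in (1.3)."""
    p, f = M / N, n / N
    return N * p * (1 - p) * f * (1 - f)

# Illustrative parameters (an assumption, not taken from the paper)
n, M, N = 50, 100, 200
total = sum(hyp_pmf(x, n, M, N) for x in range(n + 1))
print("sum of p.m.f.:", total)
print("N p (1-p) f (1-f):", normal_limit_index(n, M, N))
```

The index is small whenever any one of $M$, $N - M$, $n$ or $N - n$ is small, which is the point of the four-quantity remark above.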

We also investigate the rate of Normal approximation to the distribution of $X$. Note that $X$ is the sum of a collection of $n$ dependent Bernoulli random variables. In Section 2, we establish a Berry-Esseen Theorem on the rate of Normal approximation to the distribution function of $X$ solely under the necessary and sufficient condition (1.3). It is shown that under (1.3) the rate of approximation is $O([Np(1-p)f(1-f)]^{-1/2})$. It is also shown in Section 2 that this rate is optimal and cannot be improved. Note that the rate $O([Np(1-p)f(1-f)]^{-1/2})$ is equivalent to the standard rate $O(n^{-1/2})$ (for sums of $n$ independent Bernoulli random variables, say) only when $p$ is bounded away from 0 and 1 and $f$ is bounded away from 1. However, for $p$ and $f$ close to these boundary points, the rate of approximation can be substantially slower. In such situations, the dependence of the Bernoulli random variables associated with $X$ has a nontrivial effect on the accuracy of the Normal approximation.

Under somewhat stronger conditions on $f$ and $p$, we also derive a non-uniform version of the Berry-Esseen Theorem. The nonuniform bound shows that in the tails, the error of Normal approximation dies at a sub-Gaussian rate for a wide range of values of $f$ and $p$. As a corollary, we also derive an exponential (sub-Gaussian) probability inequality for the tails of $X$, which may be of independent interest.

The rest of the paper is organized as follows. We conclude Section 1 with a brief literature 
review. Section 2 introduces the asymptotic framework and contains the results on the validity of 
the Normal approximation and the Berry-Esseen theorems. Proofs of all the results are given in 
Section 3. 

For results on Normal approximations to Hypergeometric probabilities in the standard cases where the sampling fraction $f$ and the proportion $p$ are bounded away from 0 and 1, see Feller (1971). For general $p$ and $f$, Nicholson (1956) derived some very precise bounds for the point probabilities $P(\cdot\,; n, M, N)$ using some nonstandard normalizations of the Hypergeometric random variable $X$. General methods for proving the CLT for sample means under sampling without replacement from finite populations are given by Madow (1948), Erdos & Renyi (1959) and Hajek (1960). For results on Berry-Esseen Theorems and Edgeworth expansions for the functions of sample means and U-statistics based on finite population observations, see Babu & Singh (1985), Kokic & Weber (1990), Chen & Sitter (1993), Bloznelis (1999), Bloznelis & Gotze (2000), and the references therein.



2 Main Results 

Let $r$ be a positive integer valued variable and for each $r \in \mathbb{N}$ (where $\mathbb{N} = \{1, 2, \ldots\}$), let $X_r$ be a random variable having the Hypergeometric distribution with parameters $(n_r, M_r, N_r)$. Thus we consider a sequence of dichotomous finite populations indexed by $r$, with the proportion of objects of type A and the sampling fraction respectively given by

\[ p_r = \frac{M_r}{N_r} \quad\text{and}\quad f_r = \frac{n_r}{N_r} \quad\text{for all } r \in \mathbb{N}. \tag{2.1} \]

To avoid trivialities, all through the paper, we shall assume that

\[ 1 \le M_r < N_r, \quad 1 \le n_r < N_r \ \text{ for all } r \in \mathbb{N}, \quad\text{and}\quad N_r^{-1} = o(1) \ \text{ as } r \to \infty. \tag{2.2} \]

Thus, $p_r, f_r \in (0, 1)$ for all $r \in \mathbb{N}$. Let

\[ \sigma_r^2 = N_r p_r q_r f_r (1 - f_r), \tag{2.3} \]

where $q_r = 1 - p_r$. The first result concerns the validity of the Normal approximation to the distribution of $X_r$.
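Theorem 2.1: Suppose that (2.2) holds and that $X_r \sim \mathrm{Hyp}(n_r; M_r, N_r)$, $r \in \mathbb{N}$. Then there exists a Normal random variable $W \sim N(\mu, \sigma^2)$ for some $\mu \in \mathbb{R}$ and $\sigma \in (0, \infty)$ such that

\[ \Delta_r \equiv \sup_{x \in \mathbb{R}} \left| P\!\left( \frac{X_r - n_r p_r}{\sigma_r} \le x \right) - P(W \le x) \right| \to 0 \quad\text{as } r \to \infty \tag{2.4} \]

if and only if

\[ \sigma_r^2 \to \infty \quad\text{as } r \to \infty. \tag{2.5} \]

When (2.5) holds, one must have $\mu = 0$ and $\sigma = 1$.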



Note that $\sigma_r^2 = n_r p_r q_r (1 - f_r) = \frac{N_r - 1}{N_r}\,\mathrm{Var}(X_r)$. Hence Theorem 2.1 shows that the Normal approximation to the Hypergeometric distribution holds solely under the condition that the variance of the Hypergeometric distribution goes to infinity with $r$. In particular, it is not necessary to impose separate conditions on the asymptotic behavior of the three sequences $\{n_r\}_{r \ge 1}$, $\{p_r\}_{r \ge 1}$ and $\{f_r\}_{r \ge 1}$. A necessary condition for (2.5) is that $n_r \to \infty$ and $(N_r - n_r) \to \infty$ as $r \to \infty$. This follows by noting that $\sigma_r^2 = n_r p_r q_r (1 - f_r) = (N_r - n_r) p_r q_r f_r \le \min\{n_r, N_r - n_r\}$ for all $r \ge 1$. Thus, for the Normal approximation to hold, both the sample size $n_r$ and the residual sample size $(N_r - n_r)$ must become unbounded as $r \to \infty$. By interchanging the roles of $p_r$ and $q_r$ with $f_r$ and $(1 - f_r)$, it follows that for the validity of the Normal approximation, we must also have

\[ M_r \wedge (N_r - M_r) \to \infty \quad\text{as } r \to \infty, \tag{2.6} \]

i.e., the number of objects of type A and type B must go to infinity with $r$.
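The inequality behind these necessary conditions can be checked numerically. The following is only an illustrative sketch; the function names `sigma2` and `within_counts` and the population size are hypothetical.

```python
def sigma2(n, M, N):
    """sigma^2 = N p q f (1 - f), as in (2.3), with p = M/N and f = n/N."""
    p, f = M / N, n / N
    return N * p * (1 - p) * f * (1 - f)

def within_counts(n, M, N):
    """True when sigma^2 is at most each of n, N - n, M and N - M."""
    return sigma2(n, M, N) <= min(n, N - n, M, N - M)

# Exhaustive check over one small population size (an arbitrary choice)
N = 60
print(all(within_counts(n, M, N) for n in range(1, N) for M in range(1, N)))
```

Since $\sigma_r^2$ is dominated by the smallest of the four counts, it can diverge only if all four do.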

In a seminal paper, Hajek (1968) obtained a necessary and sufficient condition for the CLT for finite population sums, assuming that

\[ n_r \wedge (N_r - n_r) \to \infty \quad\text{as } r \to \infty. \tag{2.7} \]

The observations above imply that this is not a serious restriction; indeed, in the cases where (2.7) fails, the CLT need not hold.

Condition (2.5) also allows the proportion $p_r$ of 'type A'-objects in the population and the sampling fraction $f_r$ to simultaneously converge to the extreme points 0 and 1 at certain rates. If the sequence $\{f_r\}_{r \ge 1}$ is bounded away from 0 and 1 and (2.2) holds, then the CLT of Theorem 2.1 holds if and only if (iff)

\[ N_r^{-1} = o(q_r \wedge p_r) \quad\text{as } r \to \infty, \tag{2.8} \]

i.e., iff (2.6) holds. Similarly, for $\{p_r\}_{r \ge 1}$ bounded away from 0 and 1, the CLT holds iff

\[ N_r^{-1} = o(f_r \wedge (1 - f_r)) \quad\text{as } r \to \infty, \tag{2.9} \]

i.e., iff (2.7) holds. However, when both $\{p_r\}_{r \ge 1}$ and $\{f_r\}_{r \ge 1}$ simultaneously converge to some limits in $\{0, 1\}$, neither of (2.8) and (2.9) alone is enough to guarantee the CLT. For example, if $f_r \sim N_r^{-a}$ and $p_r \sim N_r^{-b}$ for some $0 < a, b < 1$ with $a + b > 1$, then (2.8) and (2.9) hold but the Normal approximation of Theorem 2.1 is no longer valid, since $\sigma_r^2 \sim N_r^{1 - a - b} \to 0$.
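The counterexample can be illustrated numerically. This is a rough sketch under simplifying assumptions: $n_r$ and $M_r$ are treated as real numbers (integer rounding is ignored), and the exponents $a = b = 0.6$ and the function name `sigma2_power_law` are hypothetical choices.

```python
def sigma2_power_law(N, a, b):
    """sigma^2 = N p q f (1 - f) with f = N**(-a) and p = N**(-b).

    Integer rounding of n = N f and M = N p is ignored in this sketch.
    """
    f, p = N ** (-a), N ** (-b)
    return N * p * (1 - p) * f * (1 - f)

a, b = 0.6, 0.6  # a + b > 1
for N in (10**3, 10**5, 10**7, 10**9):
    n_approx = N * N ** (-a)   # sample size, grows like N**(1 - a)
    M_approx = N * N ** (-b)   # type A count, grows like N**(1 - b)
    print(N, round(n_approx, 1), round(M_approx, 1), sigma2_power_law(N, a, b))
```

Both $n_r$ and $M_r$ grow without bound, yet $\sigma_r^2$ shrinks, so (2.5) fails.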

Next we obtain a refinement of (2.4) by specifying the rate of convergence of $\Delta_r$ to zero.

Theorem 2.2: Suppose that $X_r \sim \mathrm{Hyp}(n_r; M_r, N_r)$, $r \in \mathbb{N}$, and that (2.2) holds. Then there exists a constant $C_1 \in (0, \infty)$ such that for all $r \in \mathbb{N}$,

\[ \Delta_r \le \frac{C_1}{\sigma_r}. \tag{2.10} \]



Theorem 2.2 is a uniform Berry-Esseen theorem that shows that under (2.5), the rate of Normal approximation to the Hypergeometric distribution is uniformly $O(\sigma_r^{-1})$ as $r \to \infty$. When both the sequences $\{p_r\}_{r \ge 1}$ and $\{f_r\}_{r \ge 1}$ are bounded away from 0 and 1, this rate is $O(n_r^{-1/2})$, which is the same as the rate of Normal approximation for sums of $n_r$ independent and identically distributed (iid) random variables with a finite third moment. Although the Hypergeometric random variable $X_r$ can be written as a sum of $n_r$ dependent Bernoulli($p_r$) variables, the lack of independence of the summands does not affect the rate of Normal approximation as long as the sequence $\{p_r\}_{r \ge 1}$ is bounded away from 0 and 1 and $\{f_r\}_{r \ge 1}$ is bounded away from 1; the rate becomes worse otherwise.

A second important aspect of Theorem 2.2 is that the bound on $\Delta_r$ holds under the same condition (2.5) that is both necessary and sufficient for a Normal limit. Since $(X_r - n_r p_r)/\sigma_r$ is supported on a lattice with maximal span $\sigma_r^{-1}$, it is not difficult to show that if (2.5) holds, then $\liminf_{r \to \infty} \Delta_r \sigma_r > 0$, i.e., there exists a constant $C_2 \in (0, \infty)$ such that

\[ \Delta_r \ge \frac{C_2}{\sigma_r} \tag{2.11} \]

for all but finitely many $r$'s. Thus, the rate in Theorem 2.2 is optimal and cannot be improved upon.
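The two-sided behavior in (2.10) and (2.11) can be explored numerically. The sketch below computes the Kolmogorov distance $\Delta_r$ against $W \sim N(0, 1)$ by checking both one-sided limits at every jump of the Hypergeometric distribution function. The function names and parameter triples are hypothetical; this is an illustration, not the authors' proof technique.

```python
from math import comb, erf, sqrt

def std_normal_cdf(x):
    """Phi(x) for the standard Normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def kolmogorov_distance(n, M, N):
    """Return (Delta, sigma) for X ~ Hyp(n; M, N) against N(0, 1).

    Delta = sup_x |P((X - n p)/sigma <= x) - Phi(x)|; because the d.f. of X
    is a step function, the supremum is attained at a jump point (either the
    left limit or the value at the jump).
    """
    p, f = M / N, n / N
    sigma = sqrt(N * p * (1 - p) * f * (1 - f))
    denom = comb(N, n)
    cdf, dist = 0.0, 0.0
    for k in range(n + 1):
        z = (k - n * p) / sigma
        dist = max(dist, abs(cdf - std_normal_cdf(z)))  # left limit at the jump
        cdf += comb(M, k) * comb(N - M, n - k) / denom
        dist = max(dist, abs(cdf - std_normal_cdf(z)))  # value at the jump
    return dist, sigma

# Illustrative parameter triples (assumptions), with p = 1/2 and f = 1/4
for n, M, N in [(20, 40, 80), (80, 160, 320), (320, 640, 1280)]:
    delta, sigma = kolmogorov_distance(n, M, N)
    print(n, M, N, delta, delta * sigma)
```

If the product `delta * sigma` stays within a fixed positive range as the population grows, that is consistent with the rate $\sigma_r^{-1}$ being both attained and unimprovable.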

The next result gives a non-uniform version of the Berry-Esseen theorem. To state it, let $\phi(\cdot)$ and $\Phi(\cdot)$ respectively denote the density and the distribution function of a standard Normal random variable, i.e., $\phi(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)$, $x \in \mathbb{R}$, and $\Phi(x) = \int_{-\infty}^{x} \phi(t)\,dt$, $x \in \mathbb{R}$. Also let $I(\cdot)$ denote the indicator function. Define

^, = ^(max(ai„2))-\ r > 1, (2.12) 



where air = ./'"'"f n and where 

— Jr j 



fr 

Then, we have the following result. 



fr • if /r ^ 2 
1-fr : if /r > i 
Theorem 2.3: Suppose that $X_r \sim \mathrm{Hyp}(n_r; M_r, N_r)$, $r \in \mathbb{N}$. Assume that $r$ is such that

\[ \delta_r \sigma_r \ge 1. \tag{2.13} \]

Then there exist universal constants $C_3, C_4 \in (0, \infty)$ (not depending on $r, n_r, M_r$ and $N_r$) such that

2 

< — exp (-C^x^X^ix)) (2.14) 

(Tr Xr[X) V / 

for all $x \in \mathbb{R}$, where $\lambda_r(x) = q_r I(x < 0) + p_r I(x \ge 0)$.



< X 



(Tr 



Theorem 2.3 shows that the error of Normal approximation to the Hypergeometric distribution dies at a sub-Gaussian rate in the tails. The only condition needed for the validity of this bound is (2.13). It is easy to check that

for all $r$ satisfying (2.13). Hence, the bound in (2.14) is available for all $r$ such that $\sigma_r > 25$.

An immediate consequence of Theorem 2.3 is the following exponential (sub-Gaussian) probability bound on the tails of $X_r$.

Corollary 2.4: Suppose that $X_r \sim \mathrm{Hyp}(n_r; M_r, N_r)$, $r \in \mathbb{N}$. Then, there exist universal constants $C_5, C_6 \in (0, \infty)$ (not depending on $r, n_r, M_r, N_r$) such that for all $r$ satisfying (2.13),



P 



(Tr 



> X ) < — 3 exp (^—CQx'^lpr A Qr]^^ for all x G (0,oo). 

{Pr ^ Qr) 



3 Proofs 

We now introduce some notation and notational conventions to be used in this section. For real numbers $x, y$, let $x \wedge y = \min\{x, y\}$ and $x \vee y = \max\{x, y\}$. Let $\lfloor x \rfloor$ denote the largest integer not exceeding $x$, $x \in \mathbb{R}$. For $a \in (0, \infty)$, write $\phi_a(x) = \frac{1}{a}\phi(\frac{x}{a})$ and $\Phi_a(x) = \Phi(\frac{x}{a})$, $x \in \mathbb{R}$, for the density and distribution functions of a $N(0, a^2)$ variable. Write $\phi_a = \phi$ and $\Phi_a = \Phi$ for $a = 1$. Let

\[ \Delta_r(x) = P\!\left( \frac{X_r - n_r p_r}{\sigma_r} \le x \right) - \Phi(x), \quad x \in \mathbb{R}. \tag{3.1} \]

Let $\mathbb{N} = \{1, 2, \ldots\}$, $\mathbb{Z}_+ = \{0, 1, \ldots\}$ and $\mathbb{Z} = \{\ldots, -1, 0, 1, \ldots\}$.

For notational simplicity, we shall drop the suffix $r$ from notation, except when it is important to highlight the dependence on $r$. Thus, we write $n, M, N$ for $n_r, M_r, N_r$ respectively and set $p = M/N$, $q = 1 - p$ and $f = n/N$. We shall use $C$ to denote a generic positive constant that does not depend on $r$. Unless otherwise stated, limits in order symbols are taken by letting $r \to \infty$.

For proving the result, we shall frequently make use of Stirling's approximation (cf. Feller (1971)):

\[ m! = \sqrt{2\pi}\, e^{-m + \epsilon_m}\, m^{m + \frac12} \quad\text{for all } m \in \mathbb{N}, \tag{3.2} \]

where the error term $\epsilon_m$ admits the bound

\[ \frac{1}{12m + 1} < \epsilon_m < \frac{1}{12m} \quad\text{for all } m \in \mathbb{N}. \]
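The two-sided bound on the Stirling error term can be checked numerically for small $m$. The sketch below works on the log scale with the standard-library function `math.lgamma`; the helper names `stirling_error` and `bound_holds` are hypothetical, and the range of $m$ is limited so that floating-point rounding stays far below the gap between the bounds.

```python
from math import lgamma, log, pi

def stirling_error(m):
    """epsilon_m solving log m! = 0.5*log(2*pi) - m + epsilon_m + (m + 0.5)*log m."""
    return lgamma(m + 1) - (0.5 * log(2 * pi) - m + (m + 0.5) * log(m))

def bound_holds(m):
    """Check 1/(12m + 1) < epsilon_m < 1/(12m)."""
    eps = stirling_error(m)
    return 1 / (12 * m + 1) < eps < 1 / (12 * m)

print(all(bound_holds(m) for m in range(1, 101)))
```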



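Also note that for $g(y) = \log y$, $y \in (0, \infty)$, the $k$th derivative of $g$ is given by $g^{(k)}(y) = (-1)^{k-1}(k-1)!\, y^{-k}$, $y \in (0, \infty)$, $k \in \mathbb{N}$. Hence, for any $k \in \mathbb{N}$ and $\delta \in (0, 1)$,

\[ \left| g^{(k)}(1 + x) \right| \le \frac{(k-1)!}{(1 - \delta)^k} \quad\text{for all } 0 \le |x| \le \delta. \tag{3.3} \]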

For Lemma 3.1, let $X \sim \mathrm{Hyp}(n; M, N)$ for a given set of integers $n, M, N \in \mathbb{N}$ with $1 \le n \le (N - 1)$, $1 \le M \le (N - 1)$. Note that this notation is consistent with our convention of dropping the suffix $r$; $X, n, M, N$ in Lemma 3.1 would subsequently represent $X_r, n_r, M_r, N_r$ for a fixed $r \in \mathbb{N}$ for which (2.2) holds. Let

\[ x_{k,n} = \frac{k - np}{\sqrt{npq}} \quad\text{and}\quad a_{k,n} = \frac{x_{k,n}}{(1 - f)\sqrt{npq}}, \quad 0 \le k \le n, \tag{3.4} \]

where $f = n/N$, $p = M/N$ and $q = 1 - p$. Lemma 3.1 gives a basic approximation to Hypergeometric probabilities solely under condition (3.5) stated below.



Lemma 3.1 Suppose that $X \sim \mathrm{Hyp}(n; M, N)$ for a given set of integers $n, M, N \in \mathbb{N}$ such that

\[ 0 < f < 1, \quad 0 < p < 1 \quad\text{and}\quad 6(np \wedge nq) > 1, \tag{3.5} \]

where $f = n/N$, $p = M/N$ and $q = 1 - p$ are as in (3.4). Then, for any given $\delta \in (0, \frac12]$,

\[ \log P(k; n, M, N) = -\frac{x_{k,n}^2}{2(1 - f)} - \frac12 \log\big(2\pi npq(1 - f)\big) + r_n^*(k) \tag{3.6} \]

for all $k \in \{0, \ldots, n\}$ with $|a_{k,n}| \le \delta$, where $P(k; n, M, N) = P(X = k)$ (cf. (1.1)) and where the remainder term $r_n^*(k)$ admits the bound

'l 25 



K{h)\ < 



Gnpqil - 6){1 - f) 



+ 



(3.7) 



provided |afc „,! < 5. 



Proof: For $k \in \{0, 1, \ldots, n\}$,



/7Vp\ / ATq \ 

P{k,n,M,N) = ^ ^ 
n



p'^q''-'' R{k,n,M,N), say. 



First consider the denominator of $R(k; n, M, N)$. By (3.2),

) = 

N 



i=i 



{N-n)\m 

g(e]V — ejV-n)g— " 



(1 _ /)iV(l-/)+| 

Similarly, the numerator of R(k; n, M, N) is given by 



k—l ■ n—k—1 

TTfi-— ) TT (1-—) - 



Npl y- Nq 

Q—'n.Q^Np — ^Np-k+^Nq — ^Nq-n+k 

^-^ k_'^Np-k+\ ^-^ _ n—k ^NQ—n+k+^ 



Note that by (3.4), 



fq 



J^ = f + Xk,n\l— and 



n — k 



fp 



Np Nq 
Hence R{k\ n, M, N) can be expressed as 

R{k; n, M, N) = exp{eNp - ^Np-k + ^Nq - eNq~n+k + ^N-n - eAr)(l - /)''^(^~-^)+^ 



X < 



X < 



1 - / + Xk,., 



[fp] 

Nq 



Nq[l-f+x^,„j4^\+^ 



Next write 

Zk,n 

e* 

Then it follows that 

log R{k;n,M, N) 



^ _ ^ , yk,n = ^_ J, and 

^Np — ^Np-k + ^Nq — ^Nq-n+k + ^N-n — ^N- 



- (Np{l - /)(1 - y,,„) + i) log(l - y,, 

■ {Nq{l -/)(!+ Zfc,„) + ^) log(l + Zk,n) 
log(l - /) 



- Ai- A2, s&y. 



Fix S £ (0, 1/2). By Taylor's expansion and 



A, 



Np{l - /)(1 - yk,n) + 2 ) log(l - 



Np{l- f){l-yk,n) + l^ (^-yk,n 



Vk^n 



+ rinik) 



-yk,n { Np{i - /) + 2 



--Np{l-f))+r2n{k), 



(3.12) 



where $r_{1n}(k)$ and $r_{2n}(k)$ are remainder terms, defined by the equality of the successive expressions. By (3.3), for all $n, k$ satisfying $|y_{k,n}| \le \delta$,

|3 



\nn{k)\ < 



\yk,' 



{l-Sy 3! 



and, 



Np 



\r2n{k)\ < ^(l-/)|yfc,nr + 



Arp(l-/)(l-yfc,„) + - 



\rin{k)\. 



By similar arguments, 
A2 - 



7Vg(l-/)(l + Zfc,„) + - 
(Nq{l - /) + ^"l Zk,n + - 



log(l + Zk. 



•'k,n 



where for all n, fc, satisfying \zk^n\ ^ 



|r3n(A:)| <iV(7(l-/) 



I ^fc,n I 



iV(7(l-/)(l + %n) + 2 



+ r-3n(fc), 



I Zk^n I 



3(1 - 5r 



From, (nmi . inrn]) and (Hnil) . we have 



log i?(fe;n,M, iV) 



e*--log(l-/) 



)+ o 



iV9(l - /) - 2 



^Np(l- f)-]^^+r2n{k)+nn{k) 



= e*-ilog(l-/)-^^^ + r4.(fc) 
where for all n,k satisfying {\yk,n\ V |-Zfc,n|) ^ 

k4n(A;)| < \r2n{k)\ + \r3n{k)\ + ^\yk,n - Zk,n \ + ^ {yk,n + 

Next using Stirling's formula on the binomial term, we have 

g(fn — f fc— f^n-fc) I 



^2 



(3.13) 



(3.14) 



(3.15) 



(3.16) 



log' 



n 
,k. 



„kn—k 



log ■ 



nq - Xk,n\fnpq + log |l - ^Kn^j^ 
1 



y/2Trnpq 



np + Xk n^/npq + + Xk,n\ — 

np 



= e 



log y/27rnpq - A3 - A4, say. 



(3.17) 
where e** = e„ - - e„_fc. Next write yk,n = Xk,n-^ ^ and Zk,n = ^k,n^J:^- Then, by arguments
similar to and 

^3 = 



nq - Xk^n^/rrpQ. + ^ ) log ( 1 - Xk,n\f^ 
2/ V ' V 



and 



^4 



^;fe,n 1 + 2 j + 2 

where for all /c and n satisfying |yA:,n| V \zk^n\ ^ 



JT-P + Xk,nVnpq + 7T log U + 

2 J \ \ np 



\r5n{k)\ + \rQn{k)\ < t: \q\yk,nf + p\zk,nf + 



{1-5Y 

+ {np + -+ np\Zk,n\ j \Zk,nf 



nq + -+ nq\yk,n\ ) \yk,n\ 



Hence, as in ()3.16() . it follows that 



(3.18) 



n 



iog<; K IP' 9 



k „n~k 



e** - log y/27rnpq - „ + r7„(fc) 



(3.19) 



where for all n, fc satisfying V \zk^n\ < 



k7n(A:)| < 



1 1 

2 (4,n + yk,n) - {yl,n + 



+ k5n(A:)| + |r6„(A:)|. 



Note that 



fq + fp+{l-f)p+{l-f)q = 1, 
(/?)' + (/p)V ((1 - /)p)V ((1 - /)g)' = (1-2m)(1- 2(1 -/))<!, 

and by (3.4), yk^n = fqak,n, Zk,n = fpak,n, yk,n = pak,n, and = qak,n- Hence, it follows that 

1 1 11 

2 i\yk,n\ + \yk,n\ + kfc,n| + |4,n|) + ^ (^I,™ + 2/fc,n + 4,n + ^fc,n) < 2 1"^."! + 4«fc,n- (3-20) 

Now, combining 1)3. 8() . 1)3. 16() and (|3.18p and using (|3.2Up and the above identities, after some 
algebra, we get 



log P{k;n,M, N) 



where for all k,n satisfying \ak,n\ < 



2(1-/) 2 



- log(27rnpg(l - /)) + r*^{k), 



Kik)-e*-e**\ < \unik)\ + \r7nik)\ 



< -YWk,n\' [{I - f){fqY + (1 - WpY +p' + q' 
I 3
+ (1 + 5p)p'^ + (1 + -59)9^ 



(l-/)/2{(l + 5/(?)g2 + (i + 5/p)p2| 



+ 



(1-5) 



1, 



1 



2(5 



< 2""'=''^" + "^'«\4 + (l-5) 



3 r + \ak,n?npq [{ + l] \ \ +'^^^ ^ 



2 {i-sy 



(3.21) 



Note that for all /c,n satisfying \ak^n\ ^ 



fl - f) 

Np - k > Np - {np + (5(1 - /)npq') > np-^^ ^ > 



and 



fl - f) 

Nq - {n - k) > nq ^ ' > 0. 



Hence, by the error bound in Stirling's approximation, for all k, n with \ak^n\ < ^ and 6{npAnq) > 1, 

111 1 11 

+ 



e* > 
> 
> 



+ 



12Arp + l 12{Np-k) UNq + l 12{Nq - {n - k)) 12(iV - n) + 1 12A^ 

12A; + 1 12(n-A;) + l 

~(12Afp + l)(12(A^p - k)) ~ {UNq + l)(12(iVg - n + k)) 
1 1 



6Np{l - d){l - f) 6iVg(l-(5)(l-/) 
/ 

6npq{l - 5)il - fY 



e* < + + 



1 



1 



12(Af-n) + l 12iV 



< 



/ 



6npq{l - 6){l - fY 



e** < 



1 



1 



12n 12A: + 1 12(n -k) + l 



<0; 



(3.22) 



e** > 



1 



1 



1 



> 



n 



> 



1 



12n+l 12A; 12{n - k) - 12k{n - k) - 6npq{l - 6)' 
Hence, the lemma follows from (3.21) and the above inequalities.



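Lemma 3.2 Let $g : \mathbb{R} \to [0, \infty)$ be such that $g$ is increasing on $(-\infty, a)$ and $g$ is decreasing on $(a, \infty)$ for some $a \in \mathbb{R}$. Then, for any $k \in \mathbb{N}$, $b \in \mathbb{R}$ and $h \in (0, \infty)$,

\[ h \sum_{i=0}^{k} g(b + ih) \le \int_{b}^{b + hk} g(x)\,dx + 2h\,g(x_0), \tag{3.23} \]

where $g(x_0) = \max\{g(b + ih) : i = 0, 1, \ldots, k\}$.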
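The bound (3.23) compares a Riemann-type sum of a unimodal function with its integral. The sketch below checks it numerically for one hypothetical choice, $g(x) = \exp(-(x - a)^2)$, whose integral has a closed form in terms of `math.erf`. The function names and the grid of test values are assumptions made for illustration only.

```python
from math import erf, exp, pi, sqrt

def g(x, a=0.0):
    """A unimodal test function: increasing on (-inf, a), decreasing on (a, inf)."""
    return exp(-(x - a) ** 2)

def integral_g(lo, hi, a=0.0):
    """Closed-form integral of g over [lo, hi]."""
    return 0.5 * sqrt(pi) * (erf(hi - a) - erf(lo - a))

def lemma_3_2_gap(b, h, k, a=0.0):
    """Right side of (3.23) minus its left side; nonnegative when the bound holds."""
    values = [g(b + i * h, a) for i in range(k + 1)]
    lhs = h * sum(values)
    rhs = integral_g(b, b + h * k, a) + 2 * h * max(values)
    return rhs - lhs

gaps = [
    lemma_3_2_gap(b, h, k)
    for b in (-3.0, -1.3, -0.2, 0.0, 0.7)
    for h in (0.05, 0.3, 1.0)
    for k in (1, 5, 40)
]
print(min(gaps) >= 0.0)
```

The extra term $2h\,g(x_0)$ covers the at most two grid points adjacent to the mode, where the sum cannot be bounded by the integral over a neighboring cell.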
Proof: For $b \ge a$, by monotonicity,

k 



rb+hk 

h^g{b + ih) < hg{b) + g{x)dx . 



i=0 

For $b < a$, let $k_1 = \sup\{i : b + ih \le a\}$ and $b_1 = b + k_1 h$. Then,

h^g{b + ih) < ^ / g{x)dx + hg{b + kih) 

rbi 

< / g{x)dx + hg{bi). 
Jb 

Hence, for b < a and k > ki, 

k ki k 

h^g{b + ih) = h'^g{b + ih) + h ^ g{b + ih) 

1=0 i=0 j=fci+l 

= hY,9ib + ih) + h £ + + 

i=0 j=0 

rbi i-bi+h+{k-ki-l)h 

< / fif(x)(i2; + /igf(6i) + /i(7(6i + /i) + / g{x)dx — hg{bi) 
Jb Jbi+h 

i-b+hk 

< / g{x)dx + 2hg{xo). 
Jb 

For $b < a$ and $k \le k_1$, it is easy to check (using the arguments above) that the bound (3.23) trivially holds. This completes the proof of the lemma.



Lemma 3.3 Let $\phi(x) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)$, $x \in \mathbb{R}$. Then, for any $h \in (0, \infty)$, $b \in [0, \infty)$, $j_0 \in \mathbb{N}$,



JO 



i=0 



2 



(j){x)dx 



(3.24) 



< ^! 

- 12 



b+joh+^ „ ( „ h h 

\<f> {x)\dx + (4 + /i) max <\(f> {x)\ : b < x <b + joh + - 

b-^ L 2 2 



Proof: Note that the function $|\phi''(x)| = |x^2 - 1|\phi(x)$ is even, and on $[0, \infty)$, it is increasing on $[1, 3^{1/2}]$ and decreasing on each of the intervals $[0, 1)$ and $(3^{1/2}, \infty)$, with the maximum value at $x = 0$ and the minimum value at $x = 1$. First suppose that $(b - \frac{h}{2}, b + (j_0 + \frac12)h) \cap \{0, \sqrt{3}\} = \emptyset$. Then, writing $b_i = b + ih$, $i \ge 0$, and using Taylor's expansion, one can show that the left side of (3.24) is bounded above by



■ n 61 — TT ^ r\ "J hi — 77 



sup 10 {y)\ \ dx 



1 



j=0 ' 



10 



?/6(f),-|A + f) 



bi 

2 



V 



12 



< 



24 



JO 

E 



i=Q 

- 12 ^ 



6. 



+ 



6» + 



i=Q 



2 



Hence by two applications of Lemma 3.2, one can show that 



io+i 

i=0 



Oi 

2 



/•ft+jo/i+t „ „ h h 

< / {x)\dx + Amax{\(j) {x)\ : b <x<b + joh + -}. 

Jb-^ 2 2 



Next consider the case where $0 \in [b - \frac{h}{2}, b + \frac{h}{2})$. Then, by Taylor's expansion,



h4>{h) 



" 2 



(j){x)dx 



< k'\^"{0)\/24. 



Now using similar arguments for the case '$\sqrt{3} \in (b - \frac{h}{2}, b + (j_0 + \frac12)h)$' and using the above bounds, one can complete the proof of the lemma.

Proof of Theorem 2.1: Suppose that (2.5) holds. Fix $\epsilon \in (0, 1)$. By Chebyshev's inequality, for all $r \in \mathbb{N}$,



P 



2\ e 
>7 ^ 4 



(3.25) 



By Lemmas 3.1 and 3.3, for any $r \in \mathbb{N}$ with $f_r \le \frac12$,



Air(e) 



sup 

■^<a<fe<2 



P a< 



<b] - [$(6) -$(a)] 



< 



E 



2g-r 



|2^<fc-n,.pr<- 

+ E 

-^<a<fe<; 



a<Jr <k — rirPr ^ ^CTr 



< 



— 1^ exp — ) exp 



(A; — rirPrY 



at. 



[$(6) - #(a) 



l_C_ 

2 (Tr 



+ 



C7 



< — 



oo 
-oo 



(x)|(ix + 1 



+ 



X 



exp I I dx + 1 



^1 1 

provided — < 4. Hence, there exists an ro G N such that for all r > ro with /r < 2 

Air(e) < \- 

Also by Mill's ratio, ^>(-|) + 1 - $(|) < e0(|). Hence, using (|3.25l) and the above inequalities, it 
can be shown that for all r > tq with < ^, 



Ar(e) < e. 



(3.26) 
Next suppose that $f_r > \frac12$. Consider the collection of $N_r - n_r$ objects that are left after the sample of size $n_r$ has been selected from the population of size $N_r$. Let $Y_r$ = the number of 'type A'-objects in this collection. Then, for all $r \in \mathbb{N}$ and $j \in \mathbb{Z}$,

\[ Y_r \sim \mathrm{Hyp}(N_r - n_r; M_r, N_r), \quad\text{and}\quad P(X_r = j) = P(Y_r = M_r - j). \tag{3.27} \]

Hence,

\[ P(X_r \le k) = \sum_{j=0}^{k} P(X_r = j) = \sum_{j=0}^{k} P(Y_r = M_r - j) = P(Y_r \ge M_r - k). \]
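The complement identity in (3.27) can be verified exactly with rational arithmetic. This is a small illustrative sketch; the function names and the parameter triples are hypothetical.

```python
from fractions import Fraction
from math import comb

def hyp_pmf(x, n, M, N):
    """Exact P(X = x) for X ~ Hyp(n; M, N), returned as a Fraction."""
    if x < 0 or x > n:
        return Fraction(0)
    # math.comb(r, s) is 0 when s > r, so impossible counts get probability 0
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))

def complement_identity_holds(n, M, N):
    """Check P(X = j) = P(Y = M - j), where Y ~ Hyp(N - n; M, N)."""
    return all(
        hyp_pmf(j, n, M, N) == hyp_pmf(M - j, N - n, M, N)
        for j in range(-1, n + 2)
    )

# Illustrative cases with sampling fraction above 1/2 (arbitrary choices)
print(complement_identity_holds(30, 12, 40), complement_identity_holds(15, 9, 20))
```

The identity lets the case $f_r > \frac12$ be handled through $Y_r$, whose sampling fraction $1 - f_r$ is below $\frac12$.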

Further, note that Var(Yr) = {Nr — nr)prqr — ^jv,-"'' ) ~ ^r- Hence, for each a; G R, 

P ( — I!iI!L < x\ = P {Xr < rirPr + XCTj.) 



(Tr 



P{Xr < \nrPr + XfJ^ J ) 
P{Yr > Mr — [rirPr + XCTj-J ) 

p ,' Yr- {Nr - nr)pr ^ Mr - [UrPr + Xar\ - {Nr - nr)pr 



Gr CTf. 

= P^T > Xr) (say), 

where Yr = ^r-(Nr-nr)pr ^ ^ Mr^lnrPr + Xar\-{Nr-nr)pr _ ^^^^ ^^^^ 

Xr < [NrPr — {flrPr + XGr — 1) — NrPr + UrPr] = —X + C7~^ 

(Tr 

and similarly, Xr > —x. Hence, this implies, 

P{Yr < Xr) < P{Yr < Xr) < P{Yr < -X + a'^) 

and 

P{Yr < Xr) > P{Yr < -x) > P{Yr < -X - a'^). 

Now using the above identity and inequalities, we have 

P [^^.ZJh^ <X^- $(x) = \P{Yr > Xr) - (1 - H-X))\ = |$(-x) - P{Yr < Xr)\ 

< max{|P(i; <-x- a;^) - $(-x - a;'^)\,\P{Yr <-x + a^^) - $(-x + C7^^)|} 

+ max{|$(-x) - $(-x - 0-7^)1, |^>(-2;) - <^{-x + cr^^)!}. (3.28) 

By repeating the arguments leading to (|3.26j) . it follows that there exists ri G N such that 
for all r > ri with (1 — fr) < ^, 

sup \P{Yr <x)- $(x)| < e. (3.29) 

Hence, (jlll) now follows from (tTTHl. dCTll . dT^ and (ICTl) . with W ~ N{0,1). In particular, if 
(2.5) holds, then one must have $\mu = 0$ and $\sigma = 1$.
Conversely, suppose that (2.4) holds for some $\mu \in \mathbb{R}$ and $\sigma \in (0, \infty)$. Then, for any sequences $\{a_r\}_{r \ge 1}, \{b_r\}_{r \ge 1} \subset \mathbb{R}$ with $a_r < b_r$ for all $r \ge 1$,

\[ \left| P\!\left( a_r < \frac{X_r - n_r p_r}{\sigma_r} < b_r \right) - P(a_r < W < b_r) \right| \le 2\Delta_r \to 0 \quad\text{as } r \to \infty. \tag{3.30} \]



If possible, suppose that $\sigma_r \le 1$ infinitely often. Then, we can pick $a_r, b_r \in [-1, 1]$ such that for all such $r$, $b_r - a_r = 1$ and



I rirPr I — UrPr , I TlrPr I + 1 — UrPr 
'- -' — < ttr <br < — 



Then, 



but 



n I ^ -^1" UrPr ^ , \ n 
F \ Qr < < Or = 



(Tr 



P (or < < 6r) > inf{P(a <W <b):a,be [-1, 1], 6 - a = 1} > 0, 

infinitely often. This contradicts (3.30). Hence, we may suppose that $\sigma_r > 1$ for all but finitely many $r$'s.

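Now define $a_r = (\lfloor n_r p_r \rfloor - n_r p_r + \frac13)/\sigma_r$ and $b_r = (\lfloor n_r p_r \rfloor - n_r p_r + \frac23)/\sigma_r$. Since $P(X_r \in \{0, 1, \ldots, n_r\}) = 1$,

\[ P\!\left( a_r < \frac{X_r - n_r p_r}{\sigma_r} < b_r \right) = P\!\left( \lfloor n_r p_r \rfloor + \tfrac13 < X_r < \lfloor n_r p_r \rfloor + \tfrac23 \right) = 0. \]

Next using the definitions of $a_r$, $b_r$, and the fact that '$x - 1 < \lfloor x \rfloor \le x$ for all $x \in \mathbb{R}$', we get

\[ -\frac{2}{3\sigma_r} < a_r < b_r \le \frac{2}{3\sigma_r}, \quad r \ge 1. \tag{3.31} \]

By (3.30) and (3.31), it follows that

\[
\frac{1}{3\sigma_r} \min\left\{ \phi_\sigma(x - \mu) : |x| \le \frac{2}{3\sigma_r} \right\}
\le \int_{a_r}^{b_r} \phi_\sigma(x - \mu)\,dx
= P(a_r < W < b_r)
= \left| P\!\left( a_r < \frac{X_r - n_r p_r}{\sigma_r} < b_r \right) - P(a_r < W < b_r) \right| \to 0
\]

as $r \to \infty$.

As a result, $\sigma_r^2 \to \infty$ as $r \to \infty$ and (2.5) holds. This completes the proof of the theorem.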



To ensure economy of space, we shall first give a proof of Theorem 2.3 and then outline the 
main steps in the proof of Theorem 2.2. 



Proof of Theorem 2.3: Let $r \in \mathbb{N}$ be an integer such that (2.13) holds. Since $r$ will be held fixed all through the proof, we shall drop $r$ from the notation for simplicity, and write $f_r = f$, $\sigma_r = \sigma$, $p_r = p$, $q_r = q$, $n_r = n$, etc. First, suppose that $f \le \frac12$. Consider the case $x \le 0$. Let
i^ = _|t= = fc^, A: = 0,l,...,n. Define 

Kq = sup{A; G Z_|_ : Xfc < 0} 

Ki = supjA; e Z_|_ : Xfc > — 1} 

-fr2 = supjA; G Z-i- : > —6a} and 

= [np + xcrj , x G R, 



where 6 = 6r ^ (0, |] is as in (2.12). Note that by definition, 

Ki - 1 < np - a < Ki, K2 - I < np - 6a'^ < K2, 

Xj G [-1,0] for all Ki < j < Kq and Xj G [Sa, -1) for ah K2 < j < Ki. 



Hence, for any x G [—6a, 0], 

' X — np 



P 



a 



< x] - ^{x] 



\P{X < J,) - 



< P{X <K2)+ J2 

j=K2 

= /1+I2 + /3, say 



[Xn 



a 



+ 



E 

j=K2 



a 



(3.32) 



Consider I2 for x G [— (5cr, — 1). Note that for x < —1, "^^^"^ < x < —1. Hence J^, < Ki and 
Xj < — 1 for all j < Jx- From Lemma 3.1, 



\r*ij)\ < 



1 



6(j2(l - 6) 



+ 



I ~ |2 I - 

Fjl _|_ 



1 2J 

+ 



2a a^ [4 (1-6)' 



+ 



2a 



-A 



(3.33) 



where A = ai (^1 + 



4(1+5) 



and 



ai = air = 4(^j^^j) (cf. (2.12)). For the given choice of 6, it is easy 



to verify that 6 < and 6 A < .59. Hence 



x^ 



\r*ij)\ < (0.2)a-2 + ^ 

< m)a-' + f 



^ + -^(0.3667) +M 



a a^ 



min{0.86, — + 0.59} 
ba 



Now, from (E3SI), for all K2 < j < Ki, 

\r*{j)\ < {0.2)a~^ + \xjf 
Next note that 



1 1 , , 3ai 

— + ^ 0.3667 + — 
cr 



2a ' a2 
Jj; - np 



a 



(3.34) 



(3.35) 



a 



< X G R, 



y 2/3exp( — |-)d?/ = ^(l + 6a2)e-'"^ for all a,6G(0,oo), 
and that for any a G (0,oo), the function g{y;a) = y^exp(- ay), y G [0, oo), is increasing on
[0, y^], and decreasing on (y^, c«). Hence, by Lemmas 3.1 and 3.2, 1)3. 34(1 and ()3.35|) . with 
c = .07, we have 



J. 

^2 < E 



< 



< 



< 



< 



^exp(r*(i))-^ 
a a 



E 0(5,)k*a)|exp(|r*(j)|) 

j=K2 

4ai 



Jx 



27ro-2 



exp((j ^) E jijl^ exp(— cx^ 



4oi exp(cj" 



2\ - Jx — np 



K'2 — np 



|y|^exp(-c|y|)d?/ 



H — max{|y| exp(-c|y|) : K2 < np + ay < Jx} 
a 

C 



^(1-/) 

Also, for — 1 < X < 0, by Lemma 3.1, 



(1 + x^) exp(— cx^) 



(3.36) 



Ai(x) ^ 



P(-l<^-"^<x 



Ko 

E 



a 



Ko 

< E 



P{X = j) - -<P{xj] 



For Ki<j<Ko, from (ICTl) and (ITMll . 

k*(i)l < 





■ 1 


< 






.2a 




■ 1 


< 






.2a 




1 


< - 


- + 



A 



+ — , + (0.43)r 



111 ^ 
— + — + — ^(0.3667) + — 
5a2 2a 2a2 ^ ^ 2a, 

1 1 0.3667 3ai 
A — + — ^ + ^ + 



2a 5a2 a^ 



a 



a 



(.43)5 



A 



4ai 



Hence, for — 1 < x < 0, noting that Kq — Ki < a, 



Ko 



|Ai(x)| < E exp(-x2(0.07))exp(a-i) 



5ai 



27ra2 



5ai 



< (i^o-i^i)exp(a-i^ " ' 



27ra2 



< — . 

a 



Thus, the bound (3.36) on $I_2$ holds for all $x \in [-\delta\sigma, 0]$.



(3.37) 
Next consider $I_1$. Note that for $j \in \{0, 1, \ldots, n\}$,



P{X = j + l) >=< P{X = j) 
Np-j n- j 



j + 1 Nq - n+ j + 1 
Nq+1 

J <=> np ■ 



> 



< 1 



iV + 2 

Thus, P{X = j) < P{X = j + 1) for all < j < np - 1. Hence, by (^U^ and Lemma 3.1, 

K2-1 

j=0 

< K2P{X = K2) 
1 



(3.38) 



< K2-cl)ixK,)exp{r*{K2)) 
a 



< 



< 

< 
< 



27ra 
K2 



exp f exp 



6a--] (0.07) 



■ exp(-(5V^(0.07) + 25(0.07) + 0.13(t 

■ exp(-(5V2(0.07)) exp(0.014) 



27ro- 
np 



2TTa 



< (Q(l-/))-Vexp(-5V2(0.07)). 



It is easy to check that, 

(7exp(-5V2(0.07)) 



(1 + x2)exp(-x2(0.07)) 



< 



WmW^ : if X G [0, 



Hence, it follows that for all x G [— (5a, 0], 

C 

h < 



■(1 + x2)exp(-x2(0.07)). 



(3.39) 



6^q<j{l - f) 

Next note that by definition, xj^ < x and XK2 ^ + (7^^. Hence, for x G [—5(7,0], by Lemma 
3.3, 



h < 



< 



1 r 



<Piy)dy 



+ 



$(x) - $ (xj^ + i2a)-^) + $ (x^, - (2a)-^ 



12a2 



II II 1 

" [(A {y)\dy + 5max{|(?:) (y)| : -00 <y <x + — } 

, 2(T 
+ cI>(. + ^)-c(,_^)+C(_,, + ^). 



Note that for any o G (0, 00), 



'dy < - 
a 



y e 



{-4)^ _ 2 f°° ^ _t 0^ + 2 _<£ 



dy = - te dt 



e 2 ; 



2 
oo ,,2



y e 2 dy< I y^e 2dy<\l-; 



max{|0"(y)| : a < y < oo} < -^/(O < a < VS) + |(?!)"(a)|7(a > Vs); 

V 27r 



exp - 



(a - (2cr) 



< exp 



a a 



a2 <5 



+ — < exp — ^ + o ) for all a e (0, (5(7). 



2 2a 



2 2 



Also note that, for < a < 1, 6 G (0, oo), 



1 - m < -<p{h), 

1 - $(a) < / ct){x)dx + < (^(a)(l - a) + 0(a) = (2 - a)(/)(a). 

J a 

Thus, for any x G (0, oo), 

$(a;) < e 2 . 

Since (2(7)"^ < | and \y + (2(t)~''^| < \y\ for y < — |, we have, for all x G [— (5a, 0], 

h < ^^[2I{-2<x<0) + 5\x\(t>{x + {2a)-^)I{-6a<x<-2) 

+ 5 ( -^I(-2 < X < 0) + (x^ + l)(/)(ar + {2a)~'^)I(-Sa < x < -2) 
I v 27r 



+ -^^I{-2 <x <0) + -(j)(x + (2(7)"^) /(-5a < X < -2) + $(-(5(7 + (2c7)"^) 
v27r(7 (7 V / 



< ;i-/(-2 < a; < 0) + 2 • 

2(7 



x2 + 1 1 



2(72 



+ 



i| (^^ + 1.^ /(_<5^ < X < -2) + exp l^-i^^-^iLj 



< -(1 + 1x1) exp -y . 



(3.40) 



Next note that 



and for — ^ < X < —6a, 



X — np 



a 



< X ) = for all X < - — 



P (^^^-^ < x^ < h < (g(l-/))-Vexp(-(5V2(0..07)) 



= - exp ( - SVil - ff [^J"(0-07)) 

< {6\{l - f)a)-\x\^ exp ( - 5^1 - ffx^Om)) . 



Hence, for all x < —(5(7, 

' X — np 



< X — $(x) 



X 



< 

< 



' exp(-52g2(i _ ffx^iO.07)) + exp (-^) 
Sqil - f)<J 
- -2gxp(-(5V(l-/)V(0.07 



5q{l - f)a 



(3.41) 
Now using the fact that 6 £ for all / G (0, i], from (nOHll . dOTI) and (TMl - imTI) . it

follows that there exist numerical constants Ci and C2, not depending on n,M,N, such that for 
all X S (— cxD, 0], 

/ Y — n rt \ 

<i>(x) 



< — (l + x^)exp(-C2ga;^), 
erg 



provided $\delta\sigma \ge 1$. This proves (2.14) for $x \in (-\infty, 0]$ and $f \le \frac12$.
To prove the theorem for $x > 0$ and $f \le \frac12$, define

\[ V_r = n_r - X_r, \quad r \in \mathbb{N}. \]

Note that $V_r$ has a Hypergeometric distribution with parameters $n_r, N_r - M_r, N_r$. Further,

\[ \frac{V_r - n_r q_r}{\sigma_r} = -\frac{X_r - n_r p_r}{\sigma_r} \quad\text{for all } r \in \mathbb{N}. \]

Hence, the derived bound on the right tails of $(X_r - n_r p_r)/\sigma_r$ can be obtained by repeating the arguments above with $X_r$ replaced by $V_r$ and $p_r$ replaced by $q_r$ for any $r$ such that $\delta_r \sigma_r \ge 1$. This proves (2.14) for $x \in [0, \infty)$ and $f \le \frac12$. The proof of (2.14) for '$f \in [\frac12, 1]$ and $x \in \mathbb{R}$' follows by repeating the above arguments with $X_r, f_r$ replaced by $Y_r, 1 - f_r$ respectively and using the bounds (3.27) and (3.28). This completes the proof of the theorem.
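The reflection step used for the right tail can be checked exactly: the count of 'type B'-objects in the sample, $n - X$, has the $\mathrm{Hyp}(n; N - M, N)$ distribution, and its centered version is the negative of the centered $X$. The sketch below is illustrative only; the function names and parameter values are hypothetical.

```python
from fractions import Fraction
from math import comb

def hyp_pmf(x, n, M, N):
    """Exact P(X = x) for X ~ Hyp(n; M, N), returned as a Fraction."""
    if x < 0 or x > n:
        return Fraction(0)
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))

def type_b_identity_holds(n, M, N):
    """Check P(n - X = v) = P(V = v), where V ~ Hyp(n; N - M, N)."""
    return all(
        hyp_pmf(n - v, n, M, N) == hyp_pmf(v, n, N - M, N)
        for v in range(n + 1)
    )

def centering_flips_sign(x, n, M, N):
    """Check (n - x) - n q = -(x - n p) exactly, with p = M/N and q = 1 - p."""
    p = Fraction(M, N)
    q = 1 - p
    return (n - x) - n * q == -(x - n * p)

print(type_b_identity_holds(10, 7, 30), centering_flips_sign(4, 10, 7, 30))
```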



Proof of Theorem 2.2: As in the proof of Theorem 2.3, first we suppose that $f_r \le \frac12$. By (3.1), (3.32), (3.36), (3.37), and (3.40), it follows that for all $r$ with $\delta_r \sigma_r \ge 1$,



sup |A;(x)| < P{Xr<K2)+ sup {h + h} 

< P{Xr<K2-l) + —. 

(Jr 

By Chebyshev's inequality, noting that $K_2 - 1 < n_r p_r - \delta_r \sigma_r^2 \le K_2$, we have



Also, 



P{Xr <K2-l) < P 
< 



(Jr 

Var{Xr 



> 



K2 — rirPr — 1 



< 



< 



{K2 - 1 - rirP' 

Nr(jl 



Nr - 1 



{SrCJ^. 



2^-2 



(3.42) 



(3.43) 



sup 

-00<X< — Sr 



|A;(x)| < P{Xr<K2-l) + ^{-5rar 



< 



c 

6r(Jr 



(3.44) 
Since Sr > 2T5 ^ with < ^, from (3.42)-(3.44), it follows that there exists a universal

constant $C_3$ such that for all $r$ with $\delta_r \sigma_r \ge 1$ and $f_r \le \frac12$,

\[ \sup_{x} |\Delta_r(x)| \le \frac{C_3}{\sigma_r}. \]

Now retracing the arguments in the proof of Theorem 2.3 for the case "$x > 0$, $f_r \le \frac12$" (with the variable $V_r$) and for the case "$x \in \mathbb{R}$, $f_r > \frac12$" (with $Y_r$), one can complete the proof of Theorem 2.2.

Proof of Corollary 2.4: Use (2.14) and the inequality "$\exp(x) > (1 + x)$ for all $x \in (0, \infty)$".